perm filename PIERCE[1,JMC] blob
sn#005224 filedate 1969-11-29 generic text, type T, neo UTF8
This reply to J.R. Pierce's letter (this Journal 1969),
in which he proposes that researchers in speech recognition
give up, is motivated more by respect for Pierce's position in
the research support establishment than by the arguments
contained in his letter. The points made in his letter are
of two kinds, sociological and scientific. We will paraphrase
and reply to these points in the order in which he presents
them and then make some sociological and scientific remarks of
our own. Anything in quotation marks is taken directly from
Pierce's letter.

1. "The sort of human behavior encouraged by the lush
funding of science and engineering after World War II, and
especially after Sputnik, has been more imitated and elaborated
than commented on or analyzed. Some of the strangest aspects
of post-Sputnik behavior are exhibited in work on speech
recognition as well as in other intriguing fields such as
space, artificial intelligence, and cybernetics." Pierce then
goes on to say that the work is carried on because the subject
is glamorous and one can get money for it. "To sell suckers,
one uses deceit and offers glamor." No qualifications are made
and no arguments are given, so no reply is possible.

2. Next we have the speculation that the research in
speech recognition is partly motivated by a desire to have the
computer play Turing's imitation game. Pierce correctly points
out that it is possible to fool some of the people some of the
time by a program that doesn't understand anything.

3. Next Pierce discusses communication with computers
as a motivation for speech recognition research. He begins by
saying that communicating with computers by speech is like
controlling a car with gee and haw and bridle and reins, and
that we do quite well with keyboards, cards, tapes, and
cathode-ray tubes. Most people talk four to ten times as fast
as they can type, so there is a large potential advantage here
if it can be realized. Pierce, however, seems to be in a mood
of not conceding anything to the enemy.

4. Pierce thinks the speech recognition efforts are
doomed and says so as follows:
"There are strong reasons for believing that spoken
English is, in general, simply not recognizable phoneme by
phoneme or word by word, and that people recognize utterances,
not because they hear the phonetic features or the words
distinctly, but because they have a general sense of what a
conversation is about and are able to guess what has been
said." This is buttressed by an 1899 quotation from William
James, who says the same thing, and by anecdotes. Further: "These
considerations lead us to believe that a general phonetic
typewriter is simply impossible unless the typewriter has an
intelligence and a knowledge of language comparable to those of
a native speaker of English. This leaves a narrower question
open. Are more limited, satisfactory economic applications of
voice control realizable?"

We agree that phonetic
information nominally present in speech is often omitted and
that typing what was said in standard English often requires
understanding. There are, however, four mitigating
circumstances:
a. It is not yet clear how much spoken English
can be transcribed using only phonetic and syntactic clues.
b. It may be possible to produce a phonetic
transcription of what was actually said that may be
comprehensible to humans and usable for various purposes by
further computer processing.
c. When communicating directly with a computer,
all the restrictions determining what messages make sense that
are available to a human may also be available to the program.
d. Research in the semantics of English is
proceeding, albeit even more slowly than research in speech
recognition, and whenever this research reaches a suitable
stage, it can be connected with the speech recognition
research.

5. Pierce then knocks down the argument that we may
learn something about speech via speech recognition research by
saying that the "recognizers" mostly behave like "mad
inventors" and "untrustworthy engineers". He does not say who
the exceptions are. Without actually conducting a survey, we
contradict him by saying that most recognizers are seriously
learning about speech and that their experience in recognition
studies improves our ideas about what it is important to learn.

However, besides learning about speech, it turns out
that there is a lot to be learned about computer science,
namely, how to develop procedures that work well in very
complex circumstances.

6. The results of the experiments are often obscure.
Regrettably, this is true, and the fault is often that of the
experimenter. As an experimental field, computer science is
quite new, and the standards for what constitutes a good
experiment are not really formulated yet. However, it is
worthwhile merely to exhort people to plan their speech
recognition, game playing, theorem proving, picture
description, and other experiments so that more will be learned
than just whether the method worked.

7. Pierce concludes with an estimate that 95 percent
recognition has been achieved with small vocabularies and few
speakers, somewhat better with one speaker. He sees no
economically sound application for this. Our estimate of
applicability is as follows: At present (Reddy 1969),
vocabularies of 100 words can be recognized in real time with
95 percent accuracy on a PDP-10 computer. This is not good
enough. Given, however, the next factor of ten in machine
performance, programs that give visual feedback of what the
machine thought it heard, and the next few years' improvement in
present programs, economically useful results may be achieved
in limited communication circumstances like controlling a
time-sharing system.

8. Finally, Pierce proposes some reasonable questions
for people to ask themselves before they undertake speech
recognition research.

We have included all the scientific comments we wish to
make about the field of speech recognition in our answers to
Pierce's points, so we will conclude with some sociological
remarks.

1. Pierce's characterization of his letter as an
appeal to people engaged in speech recognition research is
disingenuous. As director of.... at Bell Telephone
Laboratories, he has killed off their speech recognition work,
and the journalistic style of his letter warrants the
interpretation that it is directed more at the suppliers of
funds for such research. Pierce was chairman of the ...
Committee ... that did a hatchet job on language translation
research based on an only slightly more scientific study of the
subject.